We investigate the dynamics of a broad class of stochastic copying processes on a network that 
includes examples from population genetics (spatially-structured Wright-Fisher models), ecology 
(Hubbell-type models), linguistics (the utterance selection model) and opinion dynamics (the voter 
model) as special cases. These models all have absorbing states of fixation where all the nodes are 
in the same state. Earlier studies of these models showed that the mean time when this occurs can 
be made to grow as different powers of the network size by varying the degree distribution of 
the network. Here we demonstrate that this effect can also arise if one varies the asymmetry of the 
copying dynamics whilst holding the degree distribution constant. In particular, we show that the 
mean time to fixation can be accelerated even on homogeneous networks when certain nodes are 
very much more likely to be copied from than copied to. We further show that there is a complex 
interplay between degree distribution and asymmetry when they may co-vary; and that the results 
are robust to correlations in the network or the initial condition. 
I. INTRODUCTION 

One of the central themes in the application of 
the ideas and techniques of non-equilibrium statistical 
physics to the modeling of biological and social systems
is that of agents interacting through a network of links
(HQ. The agents may be individuals, species, companies,
or other kinds of entity, and the nodes of the network may 
consist of one or many agents, but the general idea is the 
same. An agent at one node i interacts with another at 
node j if a link joining the two nodes is present. The 
probability of interaction may depend on the strength of 
the link or on the properties of the agents themselves. 
This stochastic dynamics may also include the birth or 
death of agents, their transformation from one type to 
another, or other more complicated processes. 

In very many applications, the network structure is 
defined through a single symmetric matrix Gij, whose 
entries give the strength of the link joining node i to 
node j. Quantitatively, this strength might specify the 
frequency that the two agents at sites i and j come to- 
gether to interact. Most simply, the entries may be zero 
if the link is absent and one if it is present: G is then the 
adjacency matrix for the network. However in some sys- 
tems, particularly in the social sciences, even variation 
in link strength or interaction frequency is not the whole 
story. Individuals may interact strongly or weakly, fre- 
quently or infrequently, and the nature of the interaction 
may, for instance, be antagonistic, neutral or reinforcing, 
or one of the agents may have significantly more impact 
than the other. To model these aspects, one may define 
another matrix, Hij, which quantifies the nature of the 
influence that an agent at node j has on one at node i. A 
key property of the H matrix that distinguishes it from 
G is that it need not be symmetric: agent i may have 
much more influence on agent j than vice versa. 



This decomposition of interactions into symmetric and 
asymmetric parts turns out to be extremely natural in 
the case of the utterance selection model for language 
change that we introduced a number of years ago [3]. In 
this model, the nodes of the network represent speakers 
who have the possibility of saying the same thing in two 
(or more) different ways. The process of language change 
is assumed to be the consequence of repeated face-to-face 
interactions between speakers, and so the frequency Gij
that the pair of individuals interacts must necessarily
be symmetric. However the weight that individual
i gives to the utterances of individual j may depend on 
factors other than the frequency of interaction, such as 
the relative social standing. Whilst the frequency that 
i interacts with j must necessarily equal the frequency 
that j interacts with i, there is no reason why the agents 
should judge each other to be of similar social standing. 
Such asymmetric effects can enter only via the matrix 
Hij. 

In this work, we systematically investigate the effect 
that varying the asymmetry (the matrix H) has on the 
dynamics of the utterance selection model. The basic 
microscopic process at work in this model is one agent 
replicating the behavior that another agent has previ- 
ously exhibited. As such, the utterance selection model 
is a member of a much larger class of stochastic copy- 
ing processes. Other models within this class include the 
Wright-Fisher model for changes in gene frequencies in a 
population (341], Hubbell's model for species diversity in
an ecological community [U and the voter model that has
been widely studied by statistical physicists as a baseline
model of opinion dynamics [10–14].

We remark that in many of these cases, the asymme- 
try encoded in the matrix H is a side-effect of the micro- 
scopic update rule that defines the model, as opposed to 
a quantity that can be varied independently in its own 
right. For example, the voter model is defined in terms of 
the following update: first, a site of the network is chosen 
at random, and then the state of that site is updated to 
match that of a randomly-chosen neighbor. This choice 
of update rule then implies that well-connected nodes 
are much more influential than poorly-connected nodes, 
as they are more likely to be copied from than copied 
to. Nevertheless, the network structure of such a model 
(the matrix G) is easily varied, and by doing so it has 
been found that the mean time to reach a state where all 
nodes have the same state — variously known as fixation, 
consensus or complete order — can grow as different pow- 
ers of the network size N [ITl - flij depending on the level 
of heterogeneity in the network structure. 
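
The voter-model update described above can be sketched as follows. This is a minimal illustration rather than code from the original study: the names `voter_step` and `run_to_consensus`, the adjacency-list argument `neighbors`, and the six-node complete graph are all assumptions made for the example.

```python
import random


def voter_step(state, neighbors, rng):
    """Pick a node uniformly at random and copy the state of a random neighbor."""
    i = rng.randrange(len(state))
    j = rng.choice(neighbors[i])
    state[i] = state[j]


def run_to_consensus(state, neighbors, rng, max_steps=10**6):
    """Repeat voter updates until all nodes agree; return the number of updates."""
    steps = 0
    while len(set(state)) > 1 and steps < max_steps:
        voter_step(state, neighbors, rng)
        steps += 1
    return steps


# Hypothetical example: complete graph on 6 nodes, alternating initial states.
rng = random.Random(0)
n = 6
neighbors = [[j for j in range(n) if j != i] for i in range(n)]
state = [i % 2 for i in range(n)]
steps = run_to_consensus(state, neighbors, rng)
```

Because the copying node is chosen uniformly while the copied node is reached through a link, a node with many neighbors is copied from more often than it copies, which is the asymmetry discussed in the text.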

By exploiting the clean separation of network struc- 
ture G and interaction asymmetry H afforded by the 
utterance selection model, we show that, even when the 
network structure is homogeneous, disparities in the im- 
pact of different agents, as expressed through the H ma- 
trix, may drive the system more quickly to fixation than 
when such disparities are absent. Thus we may arrive at 
fast fixation without the need for special 'fast' network 
structures, as observed in previous works, if we instead 
manipulate the asymmetry in the interactions. One can 
of course also consider the case where network structure 
(G) and asymmetry (H) co-vary. As we discuss later in 
this work, this leads to a wide variety of scaling relations 
between the network size and the mean time to reach 
fixation. 

This work builds on our earlier investigations of the 
utterance selection model, in which we introduced the 
model and studied the case of a fully-connected network 
with a constant Hij [3], investigated its application to
the emergence of New Zealand English [15], and studied
the effect of the network structure on the mean time
to fixation [16]. In the latter paper we considered the
model in a broader context, which included models of
population ecology and population genetics, where the
G and H matrices appeared together in a migration matrix
mij = GijHij. We showed that if mij was symmetric,
then the mean time to fixation was essentially
independent of the network structure. Since Gij is symmetric,
fixation times much shorter or longer than this
can only be found if Hij is not symmetric. However, in
Ref. [15], we found that short fixation times would be
needed to explain the rapid emergence of New Zealand
English. We therefore postulated that there must have
been a fraction of individuals in the population who had,
for instance, greater influence than average, leading to
a skewed distribution for Hij, giving a let-out from the
results of Ref. [16]. The effect of these skewed distributions
on the mean time to fixation forms the focus of the
present work.

The outline of the paper is as follows. In Sec. II we de- 
fine the model and further develop the formalism that we 
will use in the rest of the paper. In Secs. III and IV this
is applied to investigate how the structure of the matrices
Gij and Hij influences the long-time dynamics of the
model. We consider two distinct cases: one in which the
influence encoded in Hij is independent of the network
structure described by Gij, and another in which they
are directly related to one another. In the former case
we find that influence may accelerate the fixation process,
while in the latter we find a wide variety of behavior that
is summarized in Fig. 1. We conclude in Sec. V with a
summary of our findings and how they relate to studies
of similar models. An Appendix contains some useful
mathematical results that are employed in Secs. II and
III.



II. MODEL AND FORMALISM 

The system, when expressed in terms of the model
of language change mentioned in the Introduction,
consists of $N$ speakers, whose frequency of interaction
is given by a matrix $G$. More specifically, speakers $i$
and $j$ interact with a probability $G_{ij}$, normalized so that
$\sum_{\langle ij \rangle} G_{ij} = 1$, where $\langle ij \rangle$ refers to distinct pairs $i$ and $j$.
In this simple version of the model, we will only monitor
the frequency with which two different ways of saying
the same thing spread through the speaker community.
That is, as in [3], we will focus only on a single expression
with two variants, or linguemes, which we denote as
$a$ and $b$.

The state of the system is completely specified by the
probabilities for each speaker to utter the $a$ variant at
a given time $t$. These will be denoted by $x_i(t)$; the
rule by which they are determined is given below. We
will frequently express the overall state of the system as
$\mathbf{x} = (x_1, \ldots, x_N)$. The second matrix mentioned in the
Introduction, $H$, specifies how much weight $i$ gives to the
utterances of $j$. With the structure of the model in place,
it remains for the dynamics to be specified. The evolution
of language use will be taken to be usage-based:
a speaker will be influenced by the extent to which a
particular variant is used by the speaker he is in conversation
with. In the formulation introduced in [3], it was
assumed that a conversation would consist of $T$ tokens,
i.e. instances of use, of the particular word or expression
that is of interest. Here we will simply take $T = 1$.
This choice will not significantly change the nature of
the dynamics, and moreover the choice of $T$ amounts to
a rescaling of $G$ and $H$, and so it may be reintroduced at
any time by performing these rescalings.

The actual production process is expected to be
stochastic [17], and therefore we adopt the rule that at
time $t$ speaker $i$ produces variant $a$ with probability $x_i(t)$
and variant $b$ with probability $1 - x_i(t)$. This is represented
by a stochastic variable $\zeta_i$, so that $\zeta_i = 1$ with
probability $x_i(t)$ and zero otherwise. The change in the
grammar of speaker $i$ due to an interaction (conversation)
with speaker $j$ will be of the form $(\zeta_i + H_{ij}\zeta_j)$. The
first term is the result of speaker $i$ uttering an $a$ variant
($\zeta_i = 1$) or not (i.e. uttering a $b$ variant, $\zeta_i = 0$)
and the second the result of speaker $j$ uttering an $a$ variant
($\zeta_j = 1$) or not ($\zeta_j = 0$). The weight given to the
utterance of $j$ by $i$ is the factor $H_{ij}$ discussed above.

There are two further factors that have to be introduced.
First, we multiply the above interaction term by
a constant $\lambda$, which is taken to be small, since grammatical
changes as a result of a single conversation will
typically be small. Second, we have decided to choose the
random variable $\zeta$ to be one or zero, following the original
model [3]. Using our convention, the overall value of
$x_i$ has to be renormalized (by a factor of $[1 + \lambda(1 + H_{ij})]$)
at each update. An alternative choice would be to take
$\zeta$ to be one or minus one, which would avoid the need to
normalize. With an appropriate correction to the values
of $H_{ij}$ the two choices are equivalent.

If we now assume that one conversation takes place
during a time $\delta t$, and that this conversation has been
between the two speakers $i$ and $j$, then the change in the
grammar of speaker $i$ as a result of this interaction will
be

$$ x_i(t + \delta t) = \frac{x_i(t) + \lambda\left(\zeta_i + H_{ij}\zeta_j\right)}{1 + \lambda(1 + H_{ij})}, \qquad (1) $$

with a similar equation for speaker $j$ obtained by interchanging
the indices $i$ and $j$.
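
The single-conversation update can be sketched in code. This is an illustrative reading of the rule as the surrounding text describes it (an interaction term weighted by a small constant, followed by renormalization); the function names `utter` and `update_speaker`, the argument `lam` for the small constant, and the numerical values are assumptions made for the example.

```python
import random


def utter(x, rng):
    """Return 1 (variant a) with probability x, otherwise 0 (variant b)."""
    return 1 if rng.random() < x else 0


def update_speaker(x_i, zeta_i, zeta_j, H_ij, lam):
    """New value of x_i after one conversation, renormalized so it stays in [0, 1]."""
    return (x_i + lam * (zeta_i + H_ij * zeta_j)) / (1.0 + lam * (1.0 + H_ij))


# Hypothetical example: one conversation between speakers i and j.
rng = random.Random(42)
x_i, x_j, H_ij, H_ji, lam = 0.5, 0.3, 0.5, 0.2, 0.1
zeta_i, zeta_j = utter(x_i, rng), utter(x_j, rng)
new_x_i = update_speaker(x_i, zeta_i, zeta_j, H_ij, lam)
new_x_j = update_speaker(x_j, zeta_j, zeta_i, H_ji, lam)
```

Both speakers are updated from the same pair of utterances; only the weight (`H_ij` versus `H_ji`) differs, which is where the asymmetry enters.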

An alternative mechanistic description consists of
viewing speaker $i$ as containing a large number of objects,
of which a fraction $x_i(t)$ are of type $a$ and a fraction
$1 - x_i(t)$ of type $b$. One object is then picked
at random for "migration" from speaker $i$ to another
speaker. This was the formulation discussed in [16]:
speakers were viewed as islands containing individuals
of a species which could undergo birth/death and migration.
This picture shows how other models such as
the Wright-Fisher population genetics model [4, 18] or the
Hubbell ecology model [§[ may be treated with the same
formalism we have outlined here. The relationships between
these models are discussed in more detail in [20].

In simulations we repeatedly use the update rule (1),
after choosing the two speakers who are interacting using
the network structure matrix $G_{ij}$. However, to make
analytic progress we take $\delta t \to 0$, and construct a Fokker-Planck
equation for the Markov process (1). The derivation
is given in [3], where it is shown that the probability
of the system being in state $\mathbf{x}$ at time $t$, $P(\mathbf{x}, t)$, satisfies
the equation

$$ \frac{\partial P}{\partial t} = \sum_{i=1}^{N} \frac{\partial}{\partial x_i}\Big[\sum_{j \ne i} m_{ij}(x_i - x_j) P\Big] + \frac{1}{2}\sum_{i=1}^{N} G_i \frac{\partial^2}{\partial x_i^2}\big[x_i(1 - x_i) P\big], \qquad (2) $$

where $G_i = \sum_{j \ne i} G_{ij}$ and $m_{ij} = G_{ij} h_{ij}$. Here $h_{ij}$ is
$H_{ij}$ rescaled in a way that is appropriate for the Fokker-Planck
description of the model. The precise relationship
between them is $H_{ij} = \lambda h_{ij}$, and since by construction
$h_{ij}$ must be independent of $\lambda$, when we use $H_{ij}$ in the
context of a Fokker-Planck description, it is to be understood
as being proportional to $\lambda$.

The Fokker-Planck equation (2) seems far too complicated
to be amenable to analysis, but remarkably
progress can be made [21]. The reason for this lies in
the fact that after a relatively short time (compared to
the very long fixation times that are of interest to us here)
the changes in the speakers' grammars effectively become
coupled, and their dynamics can be described by a single
collective variable

$$ \xi(t) = \sum_{i=1}^{N} Q_i x_i(t), \qquad (3) $$

where $Q_i$ will be defined below. The problem now reduces
to one having a single degree of freedom. Methods
based on the backward Fokker-Planck equation [22, 23]
can then be used to obtain an expression for the mean
time to fixation. Precise criteria for determining the validity
of this reduction to a single coordinate are given
in [21]. Here we content ourselves with the observation
that these criteria are usually satisfied when the network
has sufficiently small diameter, and by checking our analytical
predictions against Monte Carlo simulations.
To define $Q_i$ we follow [16] and introduce a matrix $M_{ij}$
by

$$ M_{ij} = \begin{cases} m_{ij} & \text{if } j \ne i, \\ -\sum_{k \ne i} m_{ik} & \text{if } j = i. \end{cases} \qquad (4) $$

From this it follows that $\sum_j M_{ij} = 0$, that is, $M$ has
at least one eigenvalue equal to zero (assumed unique)
with the corresponding right-eigenvector having all elements
equal to one. The corresponding left-eigenvector
(suitably normalized) defines $Q_i$.



N 



N 



QiMij = with ^ Qi = 1. 



(5) 



i=l 



i=l 
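
For a small network the normalized left null vector can be found numerically. The sketch below is one possible approach, not the authors' procedure: it builds a matrix with off-diagonal entries $G_{ij}h_{ij}$ and rows summing to zero, then iterates $Q \leftarrow Q(I + \epsilon M)$, which is a power iteration on a row-stochastic matrix. The names `build_M` and `left_null_vector`, the step-size choice, and the iteration count are assumptions made for the example.

```python
def build_M(G, h):
    """Off-diagonal entries G[i][j] * h[i][j]; diagonal set so each row sums to zero."""
    n = len(G)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j != i:
                M[i][j] = G[i][j] * h[i][j]
        M[i][i] = -sum(M[i][j] for j in range(n) if j != i)
    return M


def left_null_vector(M, iters=5000):
    """Approximate Q with Q M = 0 and sum(Q) = 1 by iterating Q <- Q (I + eps * M)."""
    n = len(M)
    # eps is chosen so that I + eps * M has non-negative entries (diagonal >= 0.5).
    eps = 0.5 / max(-M[i][i] for i in range(n))
    Q = [1.0 / n] * n
    for _ in range(iters):
        Q = [Q[j] + eps * sum(Q[i] * M[i][j] for i in range(n)) for j in range(n)]
    total = sum(Q)
    return [q / total for q in Q]


# Hypothetical example: 4 fully connected speakers with separable weights.
alpha = [1.0, 2.0, 1.0, 0.5]
beta = [1.0, 1.0, 3.0, 2.0]
n = 4
G = [[0.0 if i == j else 1.0 / 6.0 for j in range(n)] for i in range(n)]
h = [[alpha[i] * beta[j] for j in range(n)] for i in range(n)]
Q = left_null_vector(build_M(G, h))
```

The iteration assumes a connected network, so that the zero eigenvalue is unique; for large networks a sparse eigen-solver would be the more practical choice.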



We can make some interesting observations regarding the
dynamics of the mean of $\xi(t)$, by first noting that from
the Fokker-Planck equation (2) the mean of $x_i(t)$ evolves
according to

$$ \frac{d\langle x_i(t)\rangle}{dt} = \sum_{j \ne i} m_{ij}\big(\langle x_j(t)\rangle - \langle x_i(t)\rangle\big) = \sum_{j} M_{ij}\langle x_j(t)\rangle. \qquad (6) $$

This implies that the average of $\xi(t)$ is conserved by the
dynamics:

$$ \frac{d\langle \xi(t)\rangle}{dt} = \sum_{i,j} Q_i M_{ij}\langle x_j(t)\rangle = 0, \qquad (7) $$

where we have used Eq. (5). A solution of Eq. (6) gives
$\langle x_i(t)\rangle$ as an expansion in terms of the right eigenvectors
of $M$. In the $t \to \infty$ limit only the one corresponding
to the zero eigenvalue survives, but we have seen that all
the elements of this particular eigenvector are equal. So
$\lim_{t\to\infty}\langle x_i(t)\rangle$ is independent of $i$. Since all $x_i(t)$ tend to
0 or 1 as $t \to \infty$, this is the probability of the variant
$a$ fixing. Taking the average of Eq. (3), letting $t \to \infty$,
and using $\sum_i Q_i = 1$, we see that this is also the value
of $\lim_{t\to\infty}\langle \xi(t)\rangle$. So the fixation probability is $\langle \xi(\infty)\rangle$.
However, from Eq. (7) we recall that $\langle \xi(t)\rangle$ is conserved,
so the fixation probability is also $\xi(0)$.

These are straightforward deductions that we can make
simply by considering the mean values of $x_i(t)$ and $\xi(t)$.
To make further progress one has to solve the backward
Fokker-Planck equation as indicated above. This is carried
out in [16], where it is shown that, under reasonable
assumptions that are expanded on in [21], the mean time
to fixation is given by

$$ T = -\frac{2}{r}\Big[\xi(0)\ln\xi(0) + \big(1 - \xi(0)\big)\ln\big(1 - \xi(0)\big)\Big], \qquad (8) $$

where

$$ r = \sum_{i=1}^{N} Q_i^2\, G_i\, \frac{2\sum_{j \ne i} m_{ij}}{2\sum_{j \ne i} m_{ij} + G_i}. \qquad (9) $$

So, in principle, we can find the mean fixation time from
a knowledge of the matrices $G$ and $H$ and the vector $Q$.
The first two are assumed given, since they characterize the
system under consideration. Only $Q$, the left eigenvector
of $M$ corresponding to zero eigenvalue, needs to be found.
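
Once $Q$ is known, the recipe above amounts to a short calculation. The sketch below assumes that $r$ is a sum over nodes of $Q_i^2 G_i$ weighted by $2\sum_j m_{ij}/(2\sum_j m_{ij} + G_i)$ with $m_{ij} = G_{ij}h_{ij}$, and that the mean fixation time is $-(2/r)$ times the bracket $\xi_0 \ln \xi_0 + (1-\xi_0)\ln(1-\xi_0)$. These forms, including the prefactor, are a reading of the surrounding derivation and should be checked against the published equations; the function names are hypothetical.

```python
import math


def rate_r(G, h, Q):
    """Assumed form: r = sum_i Q_i^2 G_i * 2 m_i / (2 m_i + G_i), m_i = sum_j G_ij h_ij."""
    n = len(G)
    r = 0.0
    for i in range(n):
        G_i = sum(G[i][j] for j in range(n) if j != i)
        m_i = sum(G[i][j] * h[i][j] for j in range(n) if j != i)
        r += Q[i] ** 2 * G_i * 2.0 * m_i / (2.0 * m_i + G_i)
    return r


def mean_fixation_time(r, xi0):
    """Assumed form: T = -(2 / r) [xi0 ln xi0 + (1 - xi0) ln(1 - xi0)], 0 < xi0 < 1."""
    return -(2.0 / r) * (xi0 * math.log(xi0) + (1.0 - xi0) * math.log(1.0 - xi0))


# Hypothetical example: 5 fully connected speakers, constant h, uniform Q.
n = 5
G = [[0.0 if i == j else 2.0 / (n * (n - 1)) for j in range(n)] for i in range(n)]
h = [[0.5] * n for _ in range(n)]
Q = [1.0 / n] * n
r = rate_r(G, h, Q)
T = mean_fixation_time(r, 0.5)
```

In this symmetric example $r$ scales as $1/N^2$, so the fixation time grows as $N^2$, the baseline against which the asymmetric cases are compared.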

The next section of the paper will be devoted to an
analytical study of this question for various choices of
the matrix $H$, and the subsequent section to a numerical
study. This latter section will both explore choices
which cannot be treated analytically and will also be used
to check the validity of the various approximations that
are made in the derivations presented in the paper. However,
let us end this section by recalling the case where
the analysis is most straightforward [16]. If $m$ is symmetric
(and so $H$ is symmetric, since $G$ always is), then
the right and left eigenvectors of $M$ must be identical.
Therefore $Q_i$ must be the same for all $i$ and so from the
normalization condition $Q_i = 1/N$. The constant $r$ is
now given in terms of known quantities. Incidentally, in
this case the interpretation of $\xi$ is especially clear,
as a "center-of-mass" coordinate:

$$ \xi(t) = \frac{1}{N}\sum_{i=1}^{N} x_i(t). \qquad (10) $$



The object of this paper is to investigate mean fixation 
times when H is not symmetric, that is, when the rela- 
tionship between speakers is not symmetric. This is what 
we now turn to. 



III. ANALYTIC CALCULATIONS OF MEAN 
FIXATION TIME 

We have seen that the case where $H_{ij}$ is symmetric
leads to a $Q_i$ which is equal to $1/N$ for all $i$. If we go
further and ask that $H_{ij}$ is a constant (i.e. independent
of $i$ and $j$) then we can also show that the right-hand
side of Eq. (8), and so the mean time to fixation, is independent
of the network structure [16]. To go beyond this
and make analytical progress we have to make specific
assumptions for the form of $H_{ij}$, $G_{ij}$ or both.

One particular form for $H_{ij}$ which allows us to make
such progress is to assume that $H_{ij}$ is separable:
$H_{ij} = \alpha_i\beta_j$. This is not an unreasonable assumption; it allows
us to look at the case where speakers are influenced by
(the $\alpha_i$) or influence (the $\beta_j$) other speakers irrespective
of the identity of their interlocutor.

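Under this assumption, the solution for $Q_i$ becomes
simple:

$$ Q_i = \frac{\beta_i/\alpha_i}{\sum_{j=1}^{N} \beta_j/\alpha_j}. \qquad (11) $$

It is straightforward to verify that this is a left-eigenvector
of $M$ with zero eigenvalue. This result can
be understood by interpreting the matrix element $M_{ij}$ as
the rate at which a particle hops from site $i$ to site $j$ of
the network. Application of a Kolmogorov criterion [24]
then reveals that the separable form of $H_{ij}$ implies that
detailed balance is satisfied, i.e., that $Q_i M_{ij} = Q_j M_{ji}$.
Then $Q_i \propto \beta_i/\alpha_i$ is the unique normalized solution of this set of
equations, and we can write $r$ explicitly as

$$ r = \frac{1}{\big(\sum_{k} \beta_k/\alpha_k\big)^2} \sum_{i=1}^{N} \frac{\beta_i^2}{\alpha_i^2}\, G_i\, \frac{2\alpha_i \sum_{j \ne i} G_{ij}\beta_j}{2\alpha_i \sum_{j \ne i} G_{ij}\beta_j + G_i}. \qquad (12) $$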
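
The detailed-balance claim can be checked in two lines. This is a sketch under the stated separable assumption, writing the rescaled weights so that $m_{ij} = G_{ij}\alpha_i\beta_j$ for $j \ne i$:

```latex
\begin{align*}
\frac{\beta_i}{\alpha_i}\, M_{ij}
  &= \frac{\beta_i}{\alpha_i}\, G_{ij}\, \alpha_i \beta_j
   = G_{ij}\, \beta_i \beta_j
   = G_{ji}\, \beta_j \beta_i
   = \frac{\beta_j}{\alpha_j}\, M_{ji} \qquad (j \ne i), \\
\sum_{i} \frac{\beta_i}{\alpha_i}\, M_{ij}
  &= \frac{\beta_j}{\alpha_j} \sum_{i} M_{ji} = 0 .
\end{align*}
```

The first line uses only the symmetry of $G$; the second applies it term by term (the $i = j$ term is trivially symmetric) and then uses the fact that every row of $M$ sums to zero.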



The fixation time is proportional to $1/r$, and so we will
focus on the calculation of $r$. Note, however, that $\xi(0)$
will depend on the initial values of $\mathbf{x}$. This in turn may
have a (relatively weak) effect on the fixation time. Here
we will assume that $x_i(0) = x_0\ \forall i$, so that $\xi(0) = x_0$.

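A second assumption which allows analytic progress
to be made is that the network of speaker interactions
is large, random, and uncorrelated. It is then defined
by the degrees of the nodes, that is of the speakers, and
we write $G_{ij} \propto k_i k_j$, where $k_i$ is the degree of node $i$.
Since the mean node degree, $\mu_1$, is given by $N^{-1}\sum_i k_i$
and $\sum_{ij} G_{ij} = 2$, the constant of proportionality is
$2/(N\mu_1)^2$, and so we have

$$ G_{ij} = \frac{2 k_i k_j}{(N\mu_1)^2} \quad \text{and} \quad G_i = \frac{2 k_i}{N\mu_1}. \qquad (13) $$

If we assume both the decomposition of $H_{ij}$ and
Eq. (13) we obtain

$$ r = \frac{4}{(N\mu_1)^2 \big(\sum_{k} \beta_k/\alpha_k\big)^2} \sum_{i=1}^{N} \frac{k_i \beta_i^2}{\alpha_i}\, \frac{\sum_{j \ne i} k_j\beta_j}{1 + \frac{2\alpha_i}{N\mu_1}\sum_{j \ne i} k_j\beta_j}. \qquad (14) $$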

Under these two approximations, we can try out different
schemes for the interaction weightings. We are
mainly interested in how the fixation time scales with
$N$, and are in particular looking for significant deviations
from the baseline result (found when $H_{ij}$ is a constant)
that $T$ is proportional to $N^2$.






A. Asymmetry independent of network structure 

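We first investigate the situation in which $\beta_i$ is not
a function of degree, and hence $H_{ij}$ and $G_{ij}$ are statistically
independent quantities. We will also assume that
the $\alpha_i$ are all equal: $\alpha_i = 1$, say, while the $\beta_i$ take on
arbitrary values. This means that different speakers' utterances
carry different weights with their audience, but
the importance given to them does not vary from listener
to listener. Then

$$ r = \frac{4}{(N\mu_1)^2 \big(\sum_{k} \beta_k\big)^2} \sum_{i=1}^{N} k_i \beta_i^2\, \frac{\sum_{j \ne i} k_j\beta_j}{1 + \frac{2}{N\mu_1}\sum_{j \ne i} k_j\beta_j}. \qquad (15) $$

Suppose that the $\beta_i$ are selected from some distribution.
Since they are selected independently from the $k_j$,

$$ \sum_{j=1}^{N} k_j \beta_j = N \mu_1 \langle\beta\rangle. \qquad (16) $$

There is now only the sum on $i$ remaining in Eq. (15).
It may be written in the form

$$ \sum_{i=1}^{N} k_i \beta_i^2\, \frac{1 - \delta_i}{1 - \epsilon_i}, \qquad (17) $$

where $\delta_i$ and $\epsilon_i$ are proportional to $k_i\beta_i/N$. So for large
$N$, we may expand the summand in Eq. (17) in powers
of $k_i\beta_i/N$ to obtain

$$ r = \frac{4}{N^2 \mu_1 \langle\beta\rangle^2} \left\{ \frac{\mu_1 \langle\beta\rangle\langle\beta^2\rangle}{1 + 2\langle\beta\rangle} - \frac{\mu_2 \langle\beta^3\rangle}{N\mu_1 [1 + 2\langle\beta\rangle]^2} - \frac{2\mu_3\langle\beta^4\rangle}{(N\mu_1)^2 [1 + 2\langle\beta\rangle]^3} + \cdots \right\}, \qquad (18) $$

where $\mu_n$ is the $n$th moment of the degree distribution.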

If the $\beta_i$ are selected from a generic distribution, such
as a Gaussian, the moments are well behaved, that is,
they tend to a finite value for $N \to \infty$. This implies
that $r \propto N^{-2}$ for large $N$ and so the mean time to
fixation grows as $N^2$ for large $N$. This is identical to
that obtained from the simplest case where $H_{ij}$ had no
structure at all, and suggests that if we are to look for
deviations from this behavior then we must investigate
distributions where the moments depend on $N$ in some
way. One case in which this occurs is in 'heavy-tailed distributions',
which would correspond to our intuition that
deviations from the $N^2$ behavior for the mean time to
fixation might occur when there are members of the community
who have a much larger influence than the modal
value. If we assume that the heavy tail has the structure
of a power law, then we can make analytic progress, as
discussed in the Appendix.

Returning to Eq. (18), we choose the distribution to
be a power law over its entire range, i.e.,

$$ P(\beta) = A\beta^{-\gamma} \quad \text{for } \beta > \beta_0, \qquad (19) $$

and examine the dependence of $r$ on $N$ for different values
of the exponent $\gamma$ using Eq. (A7) of the Appendix. For
instance, when $1 < \gamma < 2$, the ratio of the $N$-dependence
of the three terms in the large brackets in Eq. (18) is
$N^{(3-\gamma)/(\gamma-1)} : N^{1/(\gamma-1)} : N^{1/(\gamma-1)}$, and so the first term
dominates. For $2 < \gamma < 3$, the first moment $\langle\beta\rangle$ is a
constant, but a similar analysis shows that again the first
term dominates. Finally, when $\gamma > 3$, higher moments
may also have a finite limit as $N \to \infty$, but once again
it is found that the first term is the most important for
large $N$. Therefore for a heavy-tailed distribution of this
kind

$$ r \approx \frac{4\langle\beta^2\rangle}{N^2 \langle\beta\rangle [1 + 2\langle\beta\rangle]} \qquad (20) $$

for large $N$.

Since for $\gamma > 3$ both $\langle\beta\rangle$ and $\langle\beta^2\rangle$ have finite limits as
$N \to \infty$, we recover the $T \propto N^2$ result found from more
conventional distributions. For $1 < \gamma < 2$, Eq. (20) gives
$T \propto N$ and for $2 < \gamma < 3$, $T \propto N^{(3\gamma-5)/(\gamma-1)}$, and so in
this range the power of $N$ varies from 1 to 2, having
the former value when $\gamma = 2$. So, in summary, choosing
an extreme distribution for $\beta_i$ of the type (19) can reduce
the growth of $T$ with population size, the slowest growth
(and hence the shortest fixation times) being for $\gamma < 2$
when $T \propto N$.
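
The scaling summary can be made concrete with a short sketch. It assumes the piecewise exponents just stated ($\nu = 1$ for $1 < \gamma < 2$, $\nu = (3\gamma - 5)/(\gamma - 1)$ for $2 < \gamma < 3$, and $\nu = 2$ for $\gamma > 3$, where $T \propto N^{\nu}$), and draws weights from a pure power law by inverse-transform sampling. The function names and the lower cutoff argument `beta0` are hypothetical.

```python
import random


def sample_beta(gamma, beta0, rng):
    """Draw beta from P(beta) proportional to beta**(-gamma) on beta >= beta0 (gamma > 1)."""
    u = rng.random()  # uniform on [0, 1), so 1 - u lies in (0, 1]
    return beta0 * (1.0 - u) ** (-1.0 / (gamma - 1.0))


def fixation_exponent(gamma):
    """Predicted nu in T ~ N**nu for degree-independent power-law weights."""
    if gamma <= 1.0:
        raise ValueError("the power law is only normalizable for gamma > 1")
    if gamma <= 2.0:
        return 1.0
    if gamma < 3.0:
        return (3.0 * gamma - 5.0) / (gamma - 1.0)
    return 2.0


# Hypothetical example: weights for 1000 speakers and the predicted exponent.
rng = random.Random(7)
betas = [sample_beta(2.5, 1.0, rng) for _ in range(1000)]
nu = fixation_exponent(2.5)
```

The exponent is continuous at $\gamma = 2$ and $\gamma = 3$, which gives a quick consistency check on the three regimes.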

The complementary situation to the one we have just
examined is to take the $\beta_i$ to be all equal, while the $\alpha_i$ are
free to vary. In this situation, some speakers give more
attention to others' utterances, and some less, but the
identity of their interlocutor is not taken into account.
However in this situation the method we used when the
$\alpha_i$ were all equal does not apply, and we have been unable
to obtain any simple analytic results. We did carry out
numerical simulations of this case, which are detailed in
Section IV below.



B. Asymmetry depends on speaker's degree 

A more extreme situation might be engineered by considering
that a speaker's influence depends on the number
of their interlocutors. This might be realistic if we consider
that, for example, a popular speaker (i.e., one with
many neighbors) is given more weight by her interlocutors,
for example as in [25]. Alternatively, speakers might
divide their attention between all of their interlocutors.
The voter model described in the Introduction (see [13]
for a review) is an example of such a case: copying from
a randomly chosen neighbor implies that $H_{ij} \propto 1/k_i$,
so that the combined influence of agent $i$'s neighbors is
independent of $i$, no matter how well-connected she is.

We can access a wide range of models in a systematic
way by first supposing again that $\alpha_i$ is independent of $i$,
say $\alpha_i = 1$, and further assuming that

$$ \beta_i = A k_i^{\sigma} \qquad (21) $$

for some constants $A$ and $\sigma$. We follow the same procedure
as in Sec. III A. Beginning from Eq. (15), we write
down the analog of Eq. (16) and arrive again at the sum
in Eq. (17). However, now $\beta_i^2 k_i$ is replaced by $k_i^{2\sigma+1}$
and $\delta_i$ and $\epsilon_i$ are proportional to $k_i^{\sigma+1}/N$. Expanding in
powers of $k_i^{\sigma+1}/N$ one finds

$$ r = \frac{4 A \mu_{\sigma+1}}{N^2 \mu_1 \mu_\sigma^2 [\mu_1 + 2A\mu_{\sigma+1}]} \left\{ \mu_{2\sigma+1} - \frac{\mu_1 \mu_{3\sigma+2}}{N \mu_{\sigma+1} [\mu_1 + 2A\mu_{\sigma+1}]} + \cdots \right\}. \qquad (22) $$

For conventional degree distributions, all the moments
tend to $N$-independent values as $N$ becomes large, and
we have $r \sim 1/N^2$ as usual.
FIG. 1: (Color online.) Scaling of mean time to reach fixation
with population size. Shading represents the value of
exponent $\nu$ where $T \propto N^{\nu}$. The labels give expressions for $\nu$
in each region, with black lines marking boundaries between
regions. The diagonal hatches cover the region in which the
approximations used are not expected to be accurate.

Suppose however that the degree distribution obeys a
power law. In different regions of the $\gamma$-$\sigma$ plane different
moments appearing in (22) diverge with $N$. By carefully
examining the ratios between subsequent terms in the series,
which involve ratios of moments $\mu_{(k+1)\sigma+k}/\mu_{k\sigma+k-1}$,
we can establish that in every region the first term dominates.
This then leaves us with

$$ r \approx \frac{4 A \mu_{\sigma+1}\mu_{2\sigma+1}}{N^2 \mu_1 \mu_\sigma^2 [\mu_1 + 2A\mu_{\sigma+1}]}. \qquad (23) $$

The scaling with respect to $N$ depends on whether any
or which combination of the moments $\mu_1$, $\mu_\sigma$, $\mu_{2\sigma+1}$
diverge with $N$ [see Eq. (A7)]. This divides the $\sigma$-$\gamma$
plane into a number of regions, as seen in Fig. 1. The
mean fixation time is proportional to $1/r$, so finding the
population size dependence of Eq. (23) immediately gives
us the scaling of $T$ with $N$. In general $T \propto N^{\nu}$, and we
give expressions for $\nu$ in the various regions in Fig. 1. We
see that in a large area, $\nu = 2$ as in the standard case
of $H_{ij}$ all equal. For $\gamma < 3$ and $\sigma < 0$ the mean time to
fixation may grow faster than $N^2$. On the other hand,
for $\sigma > 0$ and above the line $\sigma = \gamma - 1$, $T$ may grow more
slowly than $N^2$, with the slowest growth rate $T \propto N^{1/2}$
being achieved when $\gamma = 3$ for $\sigma > 2$ (though, as we
will see, our approximations start to break down when
$\nu < 1$).

FIG. 2: Mean fixation time as a function of population size
for $H_{ij}$ independent of degree, as described in Sec. III A.
Results are for a fully connected network with $\beta_i$ following a
power-law distribution with decay exponent $\gamma = 2.0, 2.2, 2.6$
and constant $\alpha_i$. Markers are average fixation times for 5000
numerical runs. Solid lines are expected scaling as given by
Eq. (20), dashed lines are best fit curves of the form $T = AN^{\nu}$.

In principle one could also consider further variations,
such as $\alpha_i$ which are inversely proportional to degree (as
in the voter model, or the uniform listening scenario) and
so on. These we investigate primarily through numerical
simulations, as described below.



IV. NUMERICAL CALCULATIONS OF MEAN 
FIXATION TIME 



To check these calculations, and to explore the robustness
of our results when assumptions we have made are
relaxed, we performed Monte Carlo simulations of the
stochastic algorithm described in Sec. II. Explicitly, in
each update, we selected a pair of speakers $i$ and $j$ from
the distribution $G_{ij}$, generated an utterance $\zeta$ for each
speaker, and then applied the update rule (1) to both
speakers. This update was repeated until a state of fixation
was reached; the mean time to reach fixation is then
obtained by averaging over multiple runs. Unless otherwise
stated, we used homogeneous initial conditions, that
is, all $x_i(0)$ are initially set to the same value $x_0$.
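
The simulation loop described above can be sketched as follows. This is an illustration, not the authors' code. Because the update rule only approaches 0 or 1 asymptotically, the sketch declares fixation once every speaker is within a tolerance `eps` of the same boundary; that criterion, the `max_steps` cap, the function name and the list-of-pairs input format are all assumptions made here.

```python
import random


def simulate_fixation(pair_weights, H, x0, lam, rng, eps=1e-3, max_steps=10**6):
    """Run pairwise updates from x_i = x0; return (steps, x, fixed).

    pair_weights: list of ((i, j), weight), with weights proportional to G_ij.
    H[i][j]: weight that speaker i gives to the utterances of speaker j.
    """
    n = len(H)
    x = [x0] * n
    pairs = [p for p, _ in pair_weights]
    weights = [w for _, w in pair_weights]
    for step in range(1, max_steps + 1):
        i, j = rng.choices(pairs, weights=weights)[0]
        # Each speaker produces one token: 1 for variant a, 0 for variant b.
        zi = 1 if rng.random() < x[i] else 0
        zj = 1 if rng.random() < x[j] else 0
        new_i = (x[i] + lam * (zi + H[i][j] * zj)) / (1.0 + lam * (1.0 + H[i][j]))
        new_j = (x[j] + lam * (zj + H[j][i] * zi)) / (1.0 + lam * (1.0 + H[j][i]))
        x[i], x[j] = new_i, new_j
        if max(x) < eps or min(x) > 1.0 - eps:
            return step, x, True
    return max_steps, x, False


# Hypothetical example: 4 fully connected speakers with equal weights.
rng = random.Random(1)
n = 4
pair_weights = [((i, j), 1.0) for i in range(n) for j in range(i + 1, n)]
H = [[0.1] * n for _ in range(n)]
steps, x, fixed = simulate_fixation(pair_weights, H, 0.5, 0.1, rng)
```

Averaging `steps` over many independent runs, with different seeds, gives an estimate of the mean fixation time in units of single conversations.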
A. Check of analytical results 

We first performed numerical simulations of the situations
described in Secs. III A and III B to check our
results. We set $H_{ij} = \alpha_i\beta_j$, and held the $\alpha_i$ values constant.
For the results shown in Fig. 2 we considered a
fully connected network of speakers, that is, each speaker
is equally likely to speak with each of the other speakers,
and chose the $\beta_i$ from a power-law distribution for
various values of the power-law exponent $\gamma$.

We found that the agreement with the predictions of
Eq. (20) was very good so long as the predicted exponent
of growth of $T$ with $N$ was greater than 1, that is
for $\gamma > 2$. This includes the region $2 < \gamma < 3$, in which
the mean fixation time, $T$, grows more slowly with $N$
than in the usual situation where $T \propto N^2$. That is, the
mean time to fixation may be reduced without recourse
to any special network structure, merely by allowing heterogeneity
in the response of speakers to the utterances
of their interlocutors.

For $\gamma < 2$, Eq. (20) predicts $T \propto N$. As we approach
this region, we find the theoretical predictions
break down. This can be seen in the lowest set of data in
the figure. This is not unexpected, if we consider the approximations
made to derive our estimates of the mean
fixation time. We have assumed that there is a short
relaxation period after which the dynamics can be well
described by considering only the collective variable $\xi$
(see [21] for details). Our calculated fixation times are
only for this second stage. Typically the initial relaxation
happens in a time of order $N$. We see that if the
calculated fixation time is of a similar time scale, the initial
relaxation can no longer be ignored. This is the case
whenever $\nu$ approaches 1 when $T \propto N^{\nu}$.

Similar results were obtained for a sparse interaction 
network in which each speaker had approximately an 
equal number of neighbors. Thus shortened fixation 
times are not a consequence of all agents being able to 
interact with all other agents. 

In Fig. 3 we present simulation results for the situation
in which $\beta_i$ does depend on the speaker degree. Specifically,
speakers were placed on an uncorrelated random
network whose degree distribution follows a power law
with exponent $\gamma$. These networks were generated using
the modified configuration model described in [2(|. We 
then set $\beta_i$ equal to a fixed power of the degree $k_i$. The
results shown are for various locations in the plane spanned
by $\gamma$ and this degree exponent (see Fig. 1). The mean fixation
time grows with population size as $T \propto N^{\nu}$ with the
value of $\nu$ depending on $\gamma$ and on the degree exponent. The
numerical results are in excellent agreement with the $\nu$
values predicted by Eq. (23) for values both smaller and
larger than the baseline value of 2. As before, we found
that the agreement fails when the predicted value of $\nu$ is
1 or less. This occurs in the region marked with diagonal
hatching in Fig. 1.
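
The growth exponent $\nu$ in $T \propto N^{\nu}$ is read off from simulation data by fitting. A minimal sketch of one way to do this, assuming an ordinary least-squares fit in log-log space (the fitting procedure actually used for the figures is not stated in the text), is given below. The function name `fit_power_law` and the synthetic data are illustrative assumptions, not results from the simulations.

```python
import math

def fit_power_law(sizes, times):
    """Fit T = A * N**nu by least squares on log T versus log N.

    Returns the pair (A, nu). This is a hypothetical helper, not the
    authors' fitting code.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    # Slope of the regression line is the exponent nu.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    nu = sxy / sxx
    # Intercept of the regression line is log A.
    prefactor = math.exp(y_mean - nu * x_mean)
    return prefactor, nu

# Synthetic data that follow T = 2 N^1.5 exactly (made up for illustration).
sizes = [100, 200, 300, 400]
times = [2.0 * n ** 1.5 for n in sizes]
prefactor, nu = fit_power_law(sizes, times)
```

With real simulation averages the fitted `nu` would carry statistical error, and the smallest system sizes may need to be excluded where the initial relaxation is not negligible.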



B. Robustness of the analytical results 

We now discuss cases where the conditions for our
analytical results, Eqs. (20) and (23), do not hold, but where
we nevertheless see somewhat similar behavior.

First we investigated the effects of fixed $\beta_i$ and
heterogeneous $\alpha_i$ (on a homogeneous network). By examining
Eq. (fbl) in this case, we see that it is the smallest
values of $\alpha_i$ which contribute most to $r$. In fact we find
that $r \sim \langle 1/\alpha \rangle / N^2$. This result is similar to that found
in [21], [H| where different agents in the network could 
change state with different rates: this is one way to
interpret variation of the $\alpha$ parameter in the present work.

In this context, we considered a power-law distribution
of $\alpha$ values, such that $P(\alpha) \propto \alpha^{-\gamma}$. The moment $\langle 1/\alpha \rangle$
remains of order 1 as $N$ grows, whatever the value of $\gamma$, in
this case [see Eq. (A5)], so we would
expect to find $T \propto N^2$. Indeed this is exactly what we
observe in numerical simulations, with the mean
fixation time growing as $N^2$, as in the standard
case, regardless of the value of $\gamma$.

Considering the fact that the smallest $\alpha$ values make
the largest contribution, we also carried out simulations
with an 'inverted' power-law distribution, $P(1/\alpha) \propto (1/\alpha)^{-\gamma}$,
so that $\alpha$ has an imposed upper
bound instead of a lower bound. In this case we do see
mean fixation times changing with $\gamma$, but rather than
fixation being sped up, it is slowed down. We find $T \propto N^{\nu}$,
with $\nu$ approaching the baseline value of 2 when $\gamma = 3$,
and increasing as $\gamma$ decreases, as shown in Fig. 4. Here
we do find a difference relative to the other cases we
investigated, in that the density of the graph also has an effect
on the exponent $\nu$: it grows more quickly with decreasing





FIG. 3: Mean fixation time as a function of population size
with $\beta_i$ depending on speaker degree, as described in Section
III B. Results are for a random network whose degree
distribution obeys a power law $p(k) \propto k^{-\gamma}$. The interaction
weights $\beta_i$ depend on degree through a power of $k_i$. Markers are
average fixation times for 5000 numerical runs. Solid lines are the
expected scaling as given by Eq. (23) and Fig. 1; dashed lines
are best-fit curves of the form $T = A N^{\nu}$.

FIG. 4: Numerical results for mean fixation time for $\alpha_i$
distributed according to inverted power-law distributions, with
values of $\gamma$ given in the legend. Top line (squares, red
online) is for a fully connected graph with $\gamma = 2.6$. Lower
lines (circles) are for a sparse graph with mean degree 10 and
$\gamma = 2.8, 2.6, 2.4$ from top to bottom. Lines are fitted functions
of the form $T = a N^{\nu}$. Dashed line is $a N^2$ for comparison.



$\gamma$ on a sparse network than on a fully connected network.

Returning to heterogeneous $\beta_i$ values, we investigated
the effect of correlations between the values of neighboring
speakers. To do this, we placed the speakers on
a random sparse network, in which each speaker has
approximately the same number of neighbors. A list of
power-law distributed $\beta$ values was created, and the
largest value assigned to a randomly chosen speaker. The
next largest $\beta$ values were then assigned to the neighbors
of this speaker, followed by the remaining second neighbors
and so on until all $\beta$ values were assigned. We found
that these correlations only slightly affected the scaling
of mean fixation time with population size $N$, with $T$
growing as $N^{\nu}$ with exponent $\nu$ similar to that found in
Section III A for the same $\gamma$. To confirm this result, we
repeated the simulations, now assigning $\beta$ values from
lowest to highest, and also considered anticorrelations, in which
the lowest $\beta$ values were located on the neighbors of the
highest value and so on. In each case the growth of $T$
with $N$ was similar, though the overall prefactor was
different from that found in Section III A. Results are plotted
in Fig. 5; compare with Fig. 3. This weak dependence
of fixation times on correlations mirrors results found for
the voter model on heterogeneous networks [TTj j . 
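
The correlated assignment described above (largest value on a seed speaker, next largest on its neighbors, then second neighbors, and so on) amounts to handing out sorted values in breadth-first order. The following is a minimal sketch under that reading; the function names, the adjacency-dictionary representation, and the toy graph are assumptions made for illustration, and the anticorrelated variant is not covered.

```python
from collections import deque

def bfs_order(adjacency, start):
    """Return nodes in breadth-first order from `start`."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order

def assign_correlated(adjacency, values, start, descending=True):
    """Assign sorted values to nodes shell by shell around `start`.

    descending=True puts the largest value on `start`;
    descending=False gives the lowest-to-highest variant.
    Assumes the graph is connected (hypothetical helper).
    """
    order = bfs_order(adjacency, start)
    if len(order) != len(adjacency) or len(values) != len(adjacency):
        raise ValueError("need a connected graph and one value per node")
    ranked = sorted(values, reverse=descending)
    return dict(zip(order, ranked))

# Toy path graph 0-1-2-3 with made-up beta values.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = [1.0, 4.0, 2.0, 3.0]
betas_high_first = assign_correlated(adjacency, values, start=0)
betas_low_first = assign_correlated(adjacency, values, start=0, descending=False)
```

In a simulation the seed `start` would be chosen at random, and `values` would be drawn from the power-law distribution rather than fixed by hand.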

Finally we introduced inhomogeneity in the initial
conditions. After randomly assigning $\beta_i$ values exactly as in
Sec. III A, speakers with the largest $\beta_i$ had their
initial grammar value $x_i(0)$ set to 1, while the remainder
were set to 0, such that the overall mean grammar was
$x_0$. Our calculation assumes the largest contribution to
mean fixation time comes from the period after the
initial relaxation to a quasi-stationary state, so the initial
conditions would not be expected to affect the scaling



FIG. 5: Numerical results for mean fixation time for
correlated $\beta_i$. Speakers are located on a sparse network, with $\beta_i$
values distributed according to a power law with exponent
$\gamma = 2.4$ and correlated (see text) from highest to lowest
(triangles), from lowest to highest (inverted triangles) and
anticorrelated (circles). Lines are fitted functions of the form
$T = a N^{\nu}$, with $\nu = 1.44, 1.63, 1.72$ respectively. For
comparison the black dashed line has $\nu = 1.57$, which is the slope
expected for uncorrelated $\beta_i$.



of mean fixation time with $N$. This was indeed found
to be the case, with $T$ scaling with $N$ exactly as found
in Sec. III A. The mean fixation time is affected by
initial conditions through the center-of-mass parameter $\xi(0)$
which appears in Eq. ijHJ). This affects the prefactor but 
not the scaling of $T$ with $N$. We found that $\xi(0)$ differed
from the homogeneous case value $x_0$, as evidenced by a
much greater probability of fixation to state 1.
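
The inhomogeneous initial condition can be sketched as follows, assuming that the selected speakers start at grammar value 1 and all others at 0, and that the number of selected speakers is the nearest integer to $x_0 N$. The function name `initial_grammar` and the example values are hypothetical.

```python
def initial_grammar(betas, x0):
    """Set x_i(0) = 1 for the speakers with the largest beta, 0 otherwise.

    The number of speakers set to 1 is round(x0 * N), so the plain
    (unweighted) mean grammar is as close to x0 as N allows.
    """
    n = len(betas)
    n_ones = round(x0 * n)
    # Speaker indices ranked from largest to smallest beta.
    ranked = sorted(range(n), key=lambda i: betas[i], reverse=True)
    x = [0.0] * n
    for i in ranked[:n_ones]:
        x[i] = 1.0
    return x

# Made-up beta values for four speakers.
betas = [0.5, 3.0, 1.2, 7.0]
x_init = initial_grammar(betas, x0=0.5)
mean_grammar = sum(x_init) / len(x_init)
```

The unweighted mean here equals $x_0$, whereas the center-of-mass parameter weights speakers unequally, which is why it can differ from $x_0$ under this initial condition.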

These last numerical investigations thus support the 
value of the simpler cases for which we made analytic 
predictions. We find that they give a good indication of 
the general conditions for finding fixation times shorter 
than the standard $T \propto N^2$.



V. DISCUSSION 

In this work, we have investigated how asymmetry 
in the interactions between speakers in a model of lan- 
guage change affects the time to reach a state of fixa- 
tion (all speakers using a common conventional variant). 
Although we have couched our discussion in terms of 
the utterance selection model for language change [3], 
it is worth recalling that the Fokker-Planck equation 
that describes the continuous-time limit of the dynamics,
Eq. (2), also applies to the Wright-Fisher model for
changes in gene frequencies in a structured population 
@, to Hubbell's model of ecological community dynam- 
ics [i[ and, in a limit where all my — > 0, to a spatially- 
structured voter model . Thus our results apply quite 
generally to models in which the state of a node on a
network evolves by copying the state of a neighboring
node, whether through a birth-death process (as in the 
Wright-Fisher or Hubbell model) or by one agent adopt- 
ing another agent's behavior (as in the voter and utter- 
ance selection models). 

As we noted in the introduction, an appealing and use- 
ful property of the utterance selection model is that there 
is a clean and natural separation between the symmetric 
and asymmetric components of the agent interactions. 
It is assumed that agents' linguistic behavior is primar- 
ily affected by face-to-face interactions between speakers. 
Thus whenever agent $i$ is interacting with agent $j$, agent
$j$ is interacting with agent $i$. This is reflected in the
symmetry of the matrix $G$, viz. $G_{ij} = G_{ji}$. However it is not
necessarily the case that the outcome of the interaction is
the same for both speakers: agent $i$ may be influenced to
a greater degree by agent $j$ than the other way round. In
this instance $H_{ij} > H_{ji}$, which results in an asymmetric
$H$ matrix [37].

This formulation has allowed us to explore in a system- 
atic way the consequences of asymmetry in the dynamics 
by manipulating the H matrix while leaving the G matrix 
unchanged. This is much harder to do in the context of 
the voter model (for example), in which the asymmetry is
implicit in the model dynamics, rather than specified
explicitly as here. Whilst various attempts have been made
to separate these two contributions within the context of
the voter model, see e.g. [29], [30], the network structure
and asymmetry effects have generally remained entangled
to some degree when using the voter model as a starting
point.

Our main finding is that the mean time to fixation can 
be dramatically affected by the presence of large dispari- 
ties in the influence of different agents, for example, when 
the $H_{ij}$ are drawn from a power-law
distribution. We emphasize the distinction with similar
results for the voter model on heterogeneous networks
(e.g., [HHIIj]), in which variation in the degree of each
node (combined with the implicit asymmetry of the voter
model dynamics) is responsible for such effects. Here we
find that the fixation time can be reduced relative to the
case of uniform influence ($H_{ij} = \text{const}$) even on
homogeneous networks. This result contrasts with those of
[27], [28], in which variation in the willingness to change
state (our $\alpha$ parameter) causes a slower onset of fixation,
a result we also obtained here.

The specific networks we examined were fully connected
networks and sparsely connected random
graphs. We have found that, as in earlier work [la] , an- 
alytical predictions hold when there is a separation of 
timescales between an initial relaxation and the longer 
diffusive process that brings the system to fixation. A for- 
mal criterion for this separation of timescales is given in 
[21], but in practice we have found the diffusive timescale
dominates when it grows more rapidly than linearly with
the size of the network $N$. We note that this separation
of timescales is in fact seen on the two-dimensional
square lattice (although the diffusive timescale is only a
factor $\ln N$ longer than the relaxation timescale [31]). It
is therefore likely that our results hold for the very large
class of networks that satisfy the 'small-world' property,
that is, where the longest distance between any pair of
nodes is much smaller than the network size $N$, not just
the random graphs that we considered here [2].

We also found that a wide variety of scaling relation- 
ships between the mean fixation time and network size 
are possible when node influence and degree (a measure 
of 'popularity') co-vary. Here we found cases where fix- 
ation may be accelerated or decelerated relative to the 
baseline case of uniform influence, depending on how
influence and degree are correlated. Our results are
summarized in the phase diagram of Fig. 1 and are similar
in spirit to those obtained in the specific context of the
voter model on heterogeneous networks [29], [30].

Finally, we find that correlations in influence between
neighboring nodes only weakly affect the mean time to
fixation. This accords, for example, with a similar finding
for degree correlations for the voter model on heteroge- 
neous networks [IH [2lj . in which correlations only ap- 
pear to affect prefactors in the scaling relation between 
fixation time and network size, not the scaling exponent. 
This lack of sensitivity to correlations may be due once 
again to the 'small-world' property: since a variant can 
reach any node on the network in only a few steps, the 
question of who is using it may become only a second- 
order consideration. 

Taken together with the many results for random- 
copying processes of various guises that are to be found 
in the literature, we have by now a more-or-less complete 
understanding of the factors that enter into the fixation 
time in these models. However, some generalizations and
extensions remain to be fully explored.
Most notably, we have assumed a fixed network
structure: it is clear that this structure may also evolve 
over time, for example, as relationships are formed and 
broken between members of a social group. Furthermore, 
all the manifestations of the model we have discussed 
share the common and crucial property of neutrality with 
respect to the different variants. That is, the probabil- 
ity that agent i adopts agent j's behavior is independent 
of what that behavior actually is: there is no selection 
in the language of genetics or ecology. While both gen- 
eralizations have been the subject of considerable study 
(e.g. [U HI| examine dynamic networks and [34| selec- 
tion in a spatial setting) the role of network structure and 
interaction asymmetry seems to be less well established 
in these cases. 

Appendix A: Moments of Power-law Distributions

In this paper we frequently write results in terms of 
moments of distributions of network properties. We 
are often interested in distributions with unusually large 
values, since these model situations where some of the 
speakers have atypical characteristics. In this Appendix 
we collect together results on moments of power-law dis- 
tributions, which are of this kind, that are used in the 
main text. 

Examples of quantities that we are interested in are:
the degree $k_i$ of nodes of the network of speakers $G_{ij}$
or the matrix of the weights of utterances $H_{ij}$. These
are to be sampled from a given distribution. For generic 
distributions, the moments are not expected to depend 
on the sample size N . However for 'heavy-tailed' dis- 
tributions, the range of values likely to be taken by the 
samples grows with N, and as a consequence the various 
moments grow as some power of N. 

Suppose that the probability distribution of some
random variable $q$ takes the form

\[ P(q) = A q^{-\gamma} \quad \text{for } q > q_0, \tag{A1} \]

where $A$, $\gamma$ and $q_0$ are constants. In the limit $N \to \infty$
there will be arbitrarily large values of $q$ which are sampled.
In this case the range of values of $q$ is unbounded
($q_0 < q < \infty$) and simple integration gives the
normalization constant $A$ as $A = (\gamma - 1) q_0^{\gamma-1}$ ($\gamma > 1$) and the
$n$th moment $\mu_n$ as

\[ \mu_n = \frac{\gamma-1}{\gamma-1-n}\, q_0^{n}, \tag{A2} \]

which diverges if $n > \gamma - 1$.
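
As a numerical illustration of Eqs. (A1) and (A2), the sketch below draws samples from the power law by inverse-transform sampling and compares the sample mean with the analytic first moment. The function names, the seed, and the parameter choice ($\gamma = 4$, $q_0 = 1$, for which the variance is finite) are assumptions made for this example only.

```python
import random

def sample_power_law(gamma, q0, size, rng):
    """Draw `size` values from P(q) = A q**(-gamma) for q > q0.

    Uses the inverse of the cumulative distribution
    F(q) = 1 - (q / q0)**(1 - gamma).
    """
    exponent = -1.0 / (gamma - 1.0)
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and is never zero.
    return [q0 * (1.0 - rng.random()) ** exponent for _ in range(size)]

def moment_infinite(n, gamma, q0):
    """n-th moment for unbounded q, Eq. (A2); finite only for n < gamma - 1."""
    if n >= gamma - 1.0:
        raise ValueError("moment diverges for n >= gamma - 1")
    return (gamma - 1.0) / (gamma - 1.0 - n) * q0 ** n

rng = random.Random(12345)  # fixed seed so the run is reproducible
samples = sample_power_law(gamma=4.0, q0=1.0, size=200_000, rng=rng)
sample_mean = sum(samples) / len(samples)
exact_mean = moment_infinite(n=1, gamma=4.0, q0=1.0)
```

For exponents where $n > \gamma - 1$ the sample moment does not settle down as the sample grows, which is the finite-size effect treated in the remainder of this Appendix.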

In real applications, and in particular in this paper, we
are interested in the case where $N$ is finite. In this case
we expect that there will be some upper cutoff $q_{\max}$ that
grows with $N$. The easiest way to extract the scaling of
this cutoff with $N$ is to put

\[ \int_{q_{\max}}^{\infty} P(q)\, dq = \frac{1}{N}, \tag{A3} \]

motivated by the requirement that the values of $q$ not
seen due to finite sample-size effects will have a
cumulative probability of order $1/N$. Performing the integral
and rearranging yields $q_{\max} \sim N^{1/(\gamma-1)}$ (see e.g. [11]).
More rigorously, one can compute the distribution of the
maximum of $N$ power-law random numbers, which has
the Fréchet form

\[ P_N(q) \sim N(\gamma - 1)\, q^{-\gamma} \exp\left(-N q^{-(\gamma-1)}\right) \tag{A4} \]

for large $N$ and $q_0 = 1$ (see e.g. [35]). Using this
distribution, one can now calculate the mean value of the
maximum $q$ for a given $N$, which is found to scale in the
same way as before, $q_{\max} \sim N^{1/(\gamma-1)}$. In the context of
networks, however, there is an additional condition, in
that we do not wish to have any multiple edges. This
yields the upper cutoff $\propto N^{1/2}$ for $\gamma < 3$ [36]. Setting
$q_{\max} = a N^{1/p}$ with $p = 2$ for $\gamma < 3$ and $p = \gamma - 1$ for
$\gamma > 3$ leads to

\[ \mu_n = \frac{\gamma-1}{\gamma-1-n}\, \frac{q_0^{1-\gamma+n} - a^{1-\gamma+n} N^{(1-\gamma+n)/p}}{q_0^{1-\gamma} - a^{1-\gamma} N^{(1-\gamma)/p}} . \tag{A5} \]



If $n < \gamma - 1$, the terms in Eq. (A5) containing $N$ decay
with increasing $N$, leading to a value for the moment
(for sufficiently large $N$) close to that found in the case
of infinite $N$. On the other hand, when $n > \gamma - 1$, but
$\gamma > 1$, the term in $N$ in the numerator diverges, while
that in the denominator vanishes, leaving

\[ \mu_n \approx \frac{\gamma-1}{n-(\gamma-1)}\, \frac{a^{1-\gamma+n} N^{(1-\gamma+n)/p}}{q_0^{1-\gamma}} . \tag{A6} \]

In summary, for $\gamma > 1$, the $n$th moment is of order

\[ \mu_n \sim \begin{cases} N^{(1-\gamma+n)/2} & n > \gamma - 1,\ \gamma < 3 \\ N^{(1-\gamma+n)/(\gamma-1)} & n > \gamma - 1,\ \gamma > 3 \\ q_0^{n} \sim 1 & n < \gamma - 1 . \end{cases} \tag{A7} \]
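
The scaling summarized above can be checked numerically from the finite-$N$ moment. The sketch below evaluates the truncated power-law moment with cutoff $q_{\max} = a N^{1/p}$ and estimates the local growth exponent between two system sizes. The function names and the default values $q_0 = a = 1$ are assumptions made for illustration; the case $n = \gamma - 1$ is excluded because the prefactor is singular there.

```python
import math

def moment_finite(n, gamma, N, q0=1.0, a=1.0):
    """n-th moment of a power law truncated at q_max = a * N**(1/p).

    p = 2 for gamma < 3 (no multiple edges) and p = gamma - 1 otherwise.
    Not valid at n == gamma - 1, where the prefactor divides by zero.
    """
    p = 2.0 if gamma < 3.0 else gamma - 1.0
    e = 1.0 - gamma + n
    numerator = q0 ** e - a ** e * N ** (e / p)
    denominator = q0 ** (1.0 - gamma) - a ** (1.0 - gamma) * N ** ((1.0 - gamma) / p)
    return (gamma - 1.0) / (gamma - 1.0 - n) * numerator / denominator

def local_exponent(n, gamma, N1, N2):
    """Slope of log(moment) against log(N) between sizes N1 and N2."""
    m1 = moment_finite(n, gamma, N1)
    m2 = moment_finite(n, gamma, N2)
    return math.log(m2 / m1) / math.log(N2 / N1)

# Second moment for gamma = 2.5: expected exponent (1 - 2.5 + 2) / 2 = 0.25.
slope = local_exponent(n=2, gamma=2.5, N1=1e12, N2=1e16)
# First moment for gamma = 4: n < gamma - 1, so it stays close to 3/2.
bounded = moment_finite(n=1, gamma=4.0, N=1e12)
```

Very large sizes are used here only so that the subleading constant terms are negligible; at the system sizes used in simulations the measured slope would sit somewhat away from the asymptotic value.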



